arXiv:1509.03046v1 [cs.DS] 10 Sep 2015


Explicit Bounds for Nondeterministically 
Testable Hypergraph Parameters 


Marek Karpinski    Roland Marko


Abstract 

In this note we give a new effective proof method for the equivalence of the notions
of testability and nondeterministic testability for uniform hypergraph parameters. We
provide the first effective upper bound on the sample complexity of any nondeterministically
testable r-uniform hypergraph parameter as a function of the sample
complexity of its witness parameter for arbitrary r. The dependence is of the form
of an exponential tower function with height linear in r. Our argument depends
crucially on new upper bounds for the r-cut norm of sampled r-uniform hypergraphs.
We also apply our approach to some other restricted classes of hypergraph
parameters, and present some applications.


1 Introduction 

The topic of property testing for combinatorial structures has gained considerable attention
in recent years. In the case of graphs, the goal is, for a given property that
is invariant under relabeling of the nodes, to separate via sampling the set of graphs that satisfy
it from those that are far from having the property. The development in this direction
resulted in a number of randomized sub-linear time algorithms for the corresponding decision
problems, see [1], [2], [11]; for the background in approximation theory of NP-hard
problems for dense structures, see [3].

Several attempts were made to characterize the properties in terms of the sample
size needed for carrying out the above task. An important class consists of those that
admit a sample size that is independent of the size of the input instance; we call
these properties testable, in other works they are also referred to as strongly testable.
One particular family, introduced by Lovasz and Vesztergombi [18], consists of
properties whose testability can be certified by certain edge colorings; they called
these nondeterministically (ND in short) testable properties, and they are the main subject of
the current note. It was shown by the authors of [18] that ND-testability is equivalent
to testability for graphs; the question regarding parameters instead of properties was
also discussed. Subsequently, a constructive proof was given for the above equivalence
by Gishboliner and Shapira [10]. The first treatment of parameters and properties of
r-uniform hypergraphs (r-graphs in short) of higher order was carried out in [16]. In that
paper a proof was given for our main result, Theorem 1.3 below, that relied in part on
non-effective methods by means of the machinery developed in [7] to describe the limit
behavior of sequences of uniform hypergraphs.

The current note is based on the framework and terminology of [16] by the authors.
We will repeatedly refer to certain parts of [16] for details, but we also aim to deliver an
accurate picture by presenting the main steps here.

We proceed by providing the necessary formal definitions of parameter testability
in the dense hypergraph model.

Definition 1.1. An r-graph parameter $f$ is testable if for any $\varepsilon > 0$ there exists a positive integer
$q_f(\varepsilon)$ such that for any $q \geq q_f(\varepsilon)$ and simple r-graph $G$ with at least $q$ nodes we have

$$\mathbb{P}\bigl(|f(G) - f(\mathbb{G}(q,G))| > \varepsilon\bigr) < \varepsilon,$$

where $\mathbb{G}(q,G)$ denotes the induced subgraph on a subset $S \subseteq V(G)$ of size $q$ chosen uniformly
at random. The infimum of the functions $q_f$ satisfying the above inequality is called the sample
complexity of $f$. The testability of parameters of k-colored r-graphs is defined analogously.
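To make the sampling condition of Definition 1.1 concrete, the following Python sketch estimates the failure probability by Monte Carlo simulation for one simple parameter, the edge density of a 2-graph. The choice of parameter, the helper names, and the example graph are illustrative assumptions and are not taken from the paper.

```python
import itertools
import random

def edge_density(n, edges):
    """Edge density of a simple graph on vertices 0..n-1 (an illustrative parameter f)."""
    pairs = n * (n - 1) // 2
    return len(edges) / pairs if pairs else 0.0

def sample_induced(n, edges, q, rng):
    """Induced subgraph on a uniformly random q-subset, relabeled to 0..q-1."""
    chosen = sorted(rng.sample(range(n), q))
    index = {v: i for i, v in enumerate(chosen)}
    sub = {frozenset((index[u], index[v]))
           for u, v in (tuple(e) for e in edges)
           if u in index and v in index}
    return q, sub

def failure_rate(n, edges, q, eps, trials, rng):
    """Monte Carlo estimate of P(|f(G) - f(G(q, G))| > eps) for f = edge density."""
    target = edge_density(n, edges)
    bad = 0
    for _ in range(trials):
        m, sub = sample_induced(n, edges, q, rng)
        if abs(edge_density(m, sub) - target) > eps:
            bad += 1
    return bad / trials

# Hypothetical example: complete bipartite graph between even and odd vertices.
rng = random.Random(0)
n = 60
edges = {frozenset((u, v)) for u, v in itertools.combinations(range(n), 2)
         if (u - v) % 2 == 1}
rate = failure_rate(n, edges, q=40, eps=0.2, trials=200, rng=rng)
```

The estimate only illustrates the quantity bounded in the definition; it says nothing about the sample complexity itself, which is a worst case over all input graphs.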

One may relax the conditions of Definition 1.1 to introduce a certain version of nondeterministic
testability. The definition below was first formulated in [18].

Definition 1.2. An r-graph parameter $f$ is non-deterministically testable (ND-testable) if there
exist an integer $k$ and a testable k-colored r-graph parameter $g$, called witness, such that for any
simple graph $G$ the value $f(G) = \max_{\hat G} g(\hat G)$, where the maximum goes over the set of k-colorings
$\hat G$ of $G$ (see Section 2 for the definition of a k-coloring).
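The maximum over colorings in Definition 1.2 can be evaluated by exhaustive search on very small instances. The sketch below is only an illustration under stated assumptions: a coloring assigns to each r-subset a pair (present, beta), the witness `density_of_color` is a hypothetical example, and the enumeration is exponential in the number of r-subsets.

```python
import itertools

def nd_value(n, r, edges, k, witness):
    """Brute-force maximum of witness over all k-colorings of an r-graph.

    A coloring maps every r-subset e to a pair (present, beta) with beta in range(k),
    where present records whether e is an edge of the original graph.
    """
    subsets = list(itertools.combinations(range(n), r))
    edge_set = {tuple(sorted(e)) for e in edges}
    best = None
    for betas in itertools.product(range(k), repeat=len(subsets)):
        coloring = {e: (e in edge_set, b) for e, b in zip(subsets, betas)}
        value = witness(n, r, coloring)
        best = value if best is None else max(best, value)
    return best

def density_of_color(color):
    """Hypothetical witness: fraction of r-subsets that carry exactly the given color."""
    def g(n, r, coloring):
        return sum(1 for c in coloring.values() if c == color) / len(coloring)
    return g

# Path on 3 vertices, k = 2: the maximum is attained by giving every edge beta = 0.
value = nd_value(3, 2, [(0, 1), (1, 2)], 2, density_of_color((True, 0)))
```

For this witness the maximum is simply the edge density of the path, 2/3, which makes the example easy to check by hand.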

Our main result is the following. It was first proved in [16] without any
upper bound on the function $q_f$ in general; prior to that the case of r = 2 was resolved by
non-effective ([18]) and effective ([10], [15]) methods.

Theorem 1.3. Every non-deterministically testable r-graph parameter $f$ is testable. If $g$ is the
parameter of k-edge-colored r-graphs that certifies the testability of $f$, then we have
$q_f(\varepsilon) \leq \exp^{(4(r-1)+1)}(c_{r,k}\, q_g(\varepsilon)/\varepsilon)$ for some constant $c_{r,k} > 0$ depending only on r and k, but not on $f$ or
$g$. Here $\exp^{(t)}$ denotes the t-fold iteration of the exponential function for $t \geq 1$, and $\exp^{(0)}$ is the
identity function.
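The iterated exponential used in the theorem can be written down directly. The following Python sketch uses the hypothetical helper names `iter_exp` and `iter_log`; floating-point overflow occurs already for towers of small height, so the functions are meant only to fix the definition, not to evaluate realistic bounds.

```python
import math

def iter_exp(t, x):
    """t-fold iteration of exp; t = 0 is the identity. Raises OverflowError for tall towers."""
    if t < 0:
        raise ValueError("t must be nonnegative")
    for _ in range(t):
        x = math.exp(x)
    return x

def iter_log(t, x):
    """t-fold iterated natural logarithm, convenient for comparing tower-type bounds."""
    for _ in range(t):
        x = math.log(x)
    return x
```

Comparing two tower-type bounds is usually easier after applying `iter_log` to both sides, since the towers themselves are not representable as floats.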

This note is organized as follows. We provide the necessary notation in
Section 2. In Section 3 we state and prove the main technical development
that enables us to discard the non-effective tools featured in the proof of the main result
of [16]. In Section 4 we outline the argument of [16], slightly modified in order to
adapt the framework to the concepts of the previous section, and present the proof of the
main result, Theorem 1.3. In Section 5 we illustrate a special case of ND-testable parameters
where an improvement on the sample complexity dependence is possible. In
Section 6 some applications of the main result are shown, followed
by a discussion together with a description of directions for possible further research in
Section 7.


2 Preliminaries 

Hypergraphs and colored hypergraphs 

Simple r-uniform hypergraphs, r-graphs in short, on n vertices are
subsets $G$ of $\binom{[n]}{r}$; the size of such a $G$ is n, and the elements of $G$ are r-edges. Let $k$ be a
positive integer. The k-colored r-graphs of size n are partitions
$\mathbf{G} = (G^\alpha)_{\alpha\in[k]}$ of $\binom{[n]}{r}$ into k classes; we say that color $\alpha$ is assigned to $e$ ($\mathbf{G}(e) = \alpha$) whenever
$e \in G^\alpha$. In this sense simple r-graphs are regarded as 2-colored. Additionally we have
to introduce the special color $\iota$ for loop edges, that is, multisets of $[n]$ with cardinality r
having at least one element with multiplicity at least 2. For any finite set $C$ the term
$C$-colored graph is defined analogously to the k-colored case.

A k-coloring of a t-colored r-graph $\mathbf{G} = (G^\alpha)_{\alpha\in[t]}$ is a tk-colored r-graph $\hat{\mathbf{G}} = (\hat G^{(\alpha,\beta)})_{\alpha\in[t],\beta\in[k]}$
with colors from the set $[t]\times[k]$, where each of the original color classes indexed by $\alpha\in[t]$
can be retrieved by taking the union of the new classes corresponding to $(\alpha,\beta)$ over all
$\beta\in[k]$, that is $G^\alpha = \bigcup_{\beta\in[k]} \hat G^{(\alpha,\beta)}$. This last operation is called k-discoloring of a $[t]\times[k]$-colored
graph, we denote it by $[\hat{\mathbf{G}},k] = \mathbf{G}$. We will sometimes write tk-colored for $[t]\times[k]$-colored
graphs when it is clear from the context what we mean.

Let $q \geq 1$ and let $G$ be an r-graph of size n; then $\mathbb{G}(q,G)$ denotes the random r-graph on q vertices that is
obtained by picking a subset $S$ of $[n]$ of cardinality q uniformly at random and taking the
induced subgraph $G[S]$. For any r-graph $F$ on q vertices and any r-graph $G$, the F-density of $G$ is defined as
$t(F,G) = \mathbb{P}(F = \mathbb{G}(q,G))$.
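For small instances the F-density can be computed exactly by enumerating all q-subsets. The Python sketch below does this for uncolored r-graphs; the function name `f_density` and the convention that the sampled vertices are relabeled in increasing order are assumptions made for the illustration.

```python
import itertools

def f_density(q, f_edges, n, g_edges):
    """Exact t(F, G) = P(F = G(q, G)) for r-graphs given as collections of r-tuples.

    Assumption: the sampled vertex set is relabeled to 0..q-1 in increasing order.
    """
    f_set = {frozenset(e) for e in f_edges}
    g_set = [frozenset(e) for e in g_edges]
    hits = total = 0
    for subset in itertools.combinations(range(n), q):
        index = {v: i for i, v in enumerate(subset)}
        chosen = set(subset)
        # Edges of G that lie entirely inside the sampled set, after relabeling.
        induced = {frozenset(index[v] for v in e) for e in g_set if e <= chosen}
        hits += induced == f_set
        total += 1
    return hits / total

# Triangle on {0, 1, 2} plus an isolated vertex 3; F is a single edge on 2 vertices.
triangle_plus_point = [(0, 1), (0, 2), (1, 2)]
density = f_density(2, [(0, 1)], 4, triangle_plus_point)
```

Three of the six vertex pairs span an edge, so the density of the single edge equals 1/2, and so does the density of the empty graph on two vertices.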


Graphons 

Next we provide the description of the continuous generalization of r-graphs. We require
some basic notation from the dense graph limit theory, and refer to Lovasz [17] for an
extensive overview of recent developments in this topic.

For a finite set $S$, let $h(S)$ denote the set of nonempty subsets of $S$, and $h(S,m)$ the
set of nonempty subsets of $S$ of cardinality at most $m$. A $(2^r-1)$-dimensional real vector
$x_{h(S)}$ denotes $(x_{T_1},\dots,x_{T_{2^r-1}})$, where $T_1,\dots,T_{2^r-1}$ is a fixed ordering of the nonempty subsets
of $S$ with $T_{2^r-1} = S$; for a permutation $\pi$ of the elements of $S$ the vector $x_{\pi(h(S))}$ means
$(x_{\pi'(T_1)},\dots,x_{\pi'(T_{2^r-1})})$, where $\pi'$ is the action of $\pi$ permuting the subsets of $S$.

The r-kernel space consists of the bounded measurable functions of
the form $W\colon [0,1]^{h([r],r-1)} \to \mathbb{R}$, and the symmetric r-kernels form the subspace of those functions that
are invariant under coordinate permutations induced by $\pi \in S_r$, that is $W(x_{h([r],r-1)}) =
W(x_{\pi(h([r],r-1))})$ for each $\pi \in S_r$. We will refer to this invariance in the paper both for
r-kernels and for measurable subsets of $[0,1]^{h([r])}$ as r-symmetric. When a symmetric r-kernel
takes its values in the interval $I = [0,1]$, we call it an r-graphon.
In what follows, $\lambda$ always denotes the usual Lebesgue measure in
$\mathbb{R}^d$, where $d$ is everywhere clear from the context.

Analogously to the graph case we define the space of k-colored r-graphons whose
elements are referred to as $\mathbf{W} = (W^\alpha)_{\alpha\in[k]}$ with each of the $W^\alpha$ components being an
r-graphon. The special color $\iota$ that stands for the absence of colors has to be also employed
in this setting as rectangles on the diagonal correspond to loop edges, see below for the case
when we represent a k-colored r-graph as a graphon. The corresponding r-graphon $W^\iota$ is
$\{0,1\}$-valued. Furthermore, $\mathbf{W}$ has to satisfy $\sum_{\alpha\in[k]} W^\alpha(x) = 1$ everywhere on $[0,1]^{h([r],r-1)}$.

For $x \in [0,1]^{h([r])}$ the expression $\mathbf{W}(x)$ denotes the color at $x$; we have $\mathbf{W}(x) = \alpha$ whenever

$$\sum_{i=1}^{\alpha-1} W^i(x_{h([r],r-1)}) \leq x_{[r]} < \sum_{i=1}^{\alpha} W^i(x_{h([r],r-1)}).$$

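Similar to the discrete case, a k-coloring of a t-colored r-graphon $\mathbf{W}$ is a tk-colored r-graphon $\hat{\mathbf{W}} =
(\hat W^{(\alpha,\beta)})_{\alpha\in[t],\beta\in[k]}$ with colors from the set $[t]\times[k]$ so that $\sum_{\beta\in[k]} \hat W^{(\alpha,\beta)}(x) = W^\alpha(x)$ for each
$x \in [0,1]^{h([r],r-1)}$ and $\alpha \in [t]$. The k-discoloring $[\hat{\mathbf{W}},k]$ of $\hat{\mathbf{W}}$ and the term $C$-colored graphon
are defined analogously, and simple r-graphons are treated as 2-colored.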

For $q \geq 1$ and a k-colored r-graphon $\mathbf{W}$ the random $[k]$-colored r-graph $\mathbb{G}(q,\mathbf{W})$ is generated as
follows. The vertex set of $\mathbb{G}(q,\mathbf{W})$ is $[q]$; first we pick uniformly a random point
$(X_S)_{S\in h([q],r-1)} \in [0,1]^{h([q],r-1)}$, then conditioned on this choice we conduct independent trials
to determine the color of each edge $e \in \binom{[q]}{r}$ with the distribution given by $\mathbb{P}_e(\mathbb{G}(q,\mathbf{W})(e) =
\alpha) = W^\alpha(X_{h(e,r-1)})$ corresponding to $e$. Recall that $\iota$ is a special color which we want to avoid
in most cases during the sampling process, therefore we will highlight the conditions that
have to be imposed on the above random variables so that $\mathbb{G}(q,\mathbf{W})$ belongs to the family of k-colored r-graphs.

For a k-colored r-graph $F$ on q vertices the F-density of $\mathbf{W}$ is defined as $t(F,\mathbf{W}) = \mathbb{P}(F = \mathbb{G}(q,\mathbf{W}))$, which can be
written following the above definition of the sampled random graph as

$$t(F,\mathbf{W}) = \int_{[0,1]^{h([q],r-1)}} \prod_{e\in\binom{[q]}{r}} W^{F(e)}(x_{h(e,r-1)})\,\mathrm{d}\lambda(x_{h([q],r-1)}).$$

We can associate to each k-colored r-graph $\mathbf{G}$ of size n an r-graphon $W_{\mathbf{G}}$ by subdividing the
unit cube $[0,1]^r$ into $n^r$ small cubes the natural way and defining the function $W'\colon
[0,1]^r \to [k]\cup\{\iota\}$ that takes the value $\mathbf{G}(\{i_1,\dots,i_r\})$ on $[\frac{i_1-1}{n},\frac{i_1}{n}] \times \dots \times [\frac{i_r-1}{n},\frac{i_r}{n}]$ for distinct
$i_1,\dots,i_r$, and the value $\iota$ on the remaining diagonal cubes; we will call such functions naive
r-graphons. Then set $(W_{\mathbf{G}})^\alpha(x_{h([r],r-1)}) = \mathbb{1}\bigl(W'(p_{h([r],1)}(x_{h([r],r-1)})) = \alpha\bigr)$ for each $\alpha \in [k]\cup\{\iota\}$,
where $p_{h([r],1)}$ is the projection to the suitable coordinates. The special color $\iota$ here stands
for the absence of colors and has to be employed in this setting as rectangles on the diagonal
correspond to loop edges. The corresponding r-graphon $W^\iota$ is $\{0,1\}$-valued. Note that

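$$|t(F,\mathbf{G}) - t(F,W_{\mathbf{G}})| \leq \frac{\binom{q}{2}}{n - \binom{q}{2}} \qquad (2.1)$$

for each k-colored r-graph $F$ on q vertices, hence the representation as naive graphons is compatible in the sense
that $\lim_{n\to\infty} t(F,\mathbf{G}_n) = \lim_{n\to\infty} t(F,W_{\mathbf{G}_n})$ for any sequence $(\mathbf{G}_n)_{n\geq1}$ with $|V(\mathbf{G}_n)|$ tending to
infinity.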
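The step-function representation described above can be evaluated pointwise. The Python sketch below returns the color of the naive r-graphon of a colored r-graph at a point of the unit cube; the 0-based vertex labels, the half-open cubes, and the name `LOOP` for the special loop color are conventions chosen for this illustration only.

```python
import math

LOOP = "iota"  # stand-in for the special loop color

def naive_graphon_color(n, coloring, x):
    """Color of the naive r-graphon of a colored r-graph at a point x in [0, 1]^r.

    coloring maps frozensets of r distinct vertices in 0..n-1 to colors.
    Coordinate x_j falls into the cube with index floor(n * x_j); repeated
    indices correspond to diagonal cubes and receive the loop color.
    """
    idx = [min(int(math.floor(n * xj)), n - 1) for xj in x]
    if len(set(idx)) < len(idx):
        return LOOP
    return coloring[frozenset(idx)]

# A 2-colored 2-graph on 3 vertices: color 1 marks edges, color 0 non-edges.
example = {frozenset((0, 1)): 1, frozenset((0, 2)): 0, frozenset((1, 2)): 1}
off_diagonal = naive_graphon_color(3, example, (0.1, 0.5))
on_diagonal = naive_graphon_color(3, example, (0.1, 0.2))
```

Points whose coordinates fall into the same subinterval land on a diagonal cube, which is exactly where the loop color appears.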


Norms and distances 


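The definitions of the relevant norms are given next.

Definition 2.1. The cut norm of an r-kernel $W$ is

$$\|W\|_{\square,r} = \sup_{S_i \subseteq [0,1]^{h([r-1])},\ i\in[r]} \left| \int_{\bigcap_{i\in[r]} p_i^{-1}(S_i)} W(x_{h([r],r-1)})\,\mathrm{d}\lambda(x_{h([r],r-1)}) \right|,$$

where the supremum is taken over (r − 1)-symmetric measurable sets $S_i$, and $p_i$ is the natural
projection from $[0,1]^{h([r],r-1)}$ onto $[0,1]^{h([r]\setminus\{i\})}$. Furthermore, for an (r − 1)-symmetric partition $\mathcal{P} =
(P_i)_{i=1}^{t}$ of $[0,1]^{h([r-1])}$ the cut-$\mathcal{P}$-norm of an r-kernel is defined by the formula

$$\|W\|_{\square,r,\mathcal{P}} = \sup_{S_i \subseteq [0,1]^{h([r-1])},\ i\in[r]} \sum_{j_1,\dots,j_r=1}^{t} \left| \int_{\bigcap_{i\in[r]} p_i^{-1}(S_i\cap P_{j_i})} W(x_{h([r],r-1)})\,\mathrm{d}\lambda(x_{h([r],r-1)}) \right|,$$

where the supremum is taken over sets $S_i$ that satisfy the usual symmetries.

We remark that with the above definition it is also true that

$$\|W\|_{\square,r} = \sup_{f_1,\dots,f_r\colon [0,1]^{h([r-1])}\to[0,1]} \left| \int_{[0,1]^{h([r],r-1)}} \prod_{i=1}^{r} f_i(x_{h([r]\setminus\{i\})})\, W(x_{h([r],r-1)})\,\mathrm{d}\lambda(x_{h([r],r-1)}) \right|,$$

where the supremum goes over (r − 1)-symmetric functions $f_i$, and similarly for any
(r − 1)-symmetric partition $\mathcal{P} = (P_i)_{i=1}^{t}$ of $[0,1]^{h([r-1])}$ we have

$$\|W\|_{\square,r,\mathcal{P}} = \sup_{f_1,\dots,f_r\colon [0,1]^{h([r-1])}\to[0,1]} \sum_{j_1,\dots,j_r=1}^{t} \left| \int_{[0,1]^{h([r],r-1)}} \prod_{i=1}^{r} f_i(x_{h([r]\setminus\{i\})})\, \mathbb{1}_{P_{j_i}}(x_{h([r]\setminus\{i\})})\, W(x_{h([r],r-1)})\,\mathrm{d}\lambda(x) \right|.$$

A relaxed variant of the r-cut norm is

$$\|W\|_{\boxplus,r} = \sup_{f_1,\dots,f_r\colon [0,1]^{h([r-1])}\to[-1,1]} \left| \int_{[0,1]^{h([r],r-1)}} \prod_{i=1}^{r} f_i(x_{h([r]\setminus\{i\})})\, W(x_{h([r],r-1)})\,\mathrm{d}\lambda(x_{h([r],r-1)}) \right|.$$

It is straightforward that $2^{-r}\|W\|_{\boxplus,r} \leq \|W\|_{\square,r}$ for every r and r-kernel $W$. Note that
for r = 2 we have $\|W\|_{\boxplus,2} = \|T_W\|_{\infty\to1}$, where $T_W$ is the integral operator from $L^\infty([0,1])$ to
$L^1([0,1])$ with the kernel $W$.

We also mention that in several previous papers, see e.g. [2], the cut norm for r-arrays
denotes a term that is significantly different from the one in Definition 2.1 and is
not suitable for our present purposes. The above norms give rise to a distance between
r-graphons, and analogously for r-graphs.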


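Definition 2.2. For two k-colored r-graphons $\mathbf{U} = (U^\alpha)_{\alpha\in[k]}$ and $\mathbf{W} = (W^\alpha)_{\alpha\in[k]}$, their cut distance
is defined as

$$d_{\square,r}(\mathbf{U},\mathbf{W}) = \sum_{\alpha=1}^{k} \|U^\alpha - W^\alpha\|_{\square,r},$$

and their cut-$\mathcal{P}$-distance as

$$d_{\square,r,\mathcal{P}}(\mathbf{U},\mathbf{W}) = \sum_{\alpha=1}^{k} \|U^\alpha - W^\alpha\|_{\square,r,\mathcal{P}}.$$

For two k-colored r-graphs $\mathbf{G} = (G^\alpha)_{\alpha\in[k]}$ and $\mathbf{H} = (H^\alpha)_{\alpha\in[k]}$ their corresponding distances are
defined as

$$d_{\square,r}(\mathbf{G},\mathbf{H}) = d_{\square,r}(W_{\mathbf{G}},W_{\mathbf{H}}),$$

and

$$d_{\square,r,\mathcal{P}}(\mathbf{G},\mathbf{H}) = d_{\square,r,\mathcal{P}}(W_{\mathbf{G}},W_{\mathbf{H}}).$$

Distances between an r-graph and an r-graphon, as well as in the case of r-kernels, are defined
analogously.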

A generalization of the notion of a step function in the case of 2-graphons (see [5]) to
the situation where we deal with r-graphons is given next. For a partition $\mathcal{P}$ the number
of its classes is denoted by $t_{\mathcal{P}}$.

Definition 2.3. We call an k-colored r-graphon W with r > I an {r,l)-step function if there exist 
positive integers U, f/+i,..., f, =k,an l-symmetric partition P = (Pi,..., Pf,) o/ [0, and real 
arrays A" : ^ [0,1] with a G [ts]for I + 1 < s < r such that JLaeUs] ^“(4([s],s-i)) = 1 

for any choice o/ih([s],s-i) and for s < r so that W“ for a G [k] is of the following form for each 
a G [k]. 


hs\ 

^“(^h([r],r-l)) = Yj ^r(G([r],r-l)) 

«S=1 

Sc[r],/<|S| 


n 

S.(W) 


IPig (^h(S)) 

Sc[r] ;=1 
/+l<|S|<r 


< Xs < 


k 


/=1 


We refer to the partition $\mathcal{P}$ as the steps of $\mathbf{W}$.

The simplest example is the (r, r − 1)-step function that can be written as

$$W^\alpha(x_{h([r],r-1)}) = \sum_{i_1,\dots,i_r=1}^{t_{r-1}} A^\alpha(i_1,\dots,i_r) \prod_{j=1}^{r} \mathbb{1}_{P_{i_j}}(x_{h([r]\setminus\{j\})}),$$

where $\mathcal{P}$ is an (r − 1)-symmetric partition.

We speak of naive step functions when the arrays in the above definition have {0,1}
entries. In this case the color of an edge is determined by the membership of its subtuples
in the classes of $\mathcal{P}$. Basically, an (r, l)-step function can be regarded as an interpolation
between discrete and continuous objects, since $W(y_{h([l])},\cdot)$ is discrete for any fixed $y_{h([l])} \in
[0,1]^{h([l])}$.

Auxiliary lemmas 

We require three technical lemmas from [16]; we refer to Sections 3 and 4 in [16] for the
proofs. The first asserts that any r-graphon can be approximated in a certain sense by an
r-graphon of bounded complexity in terms of the permitted error and is a variant of the
Regularity Lemma.

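Lemma 2.4. For every $r,t,k \geq 1$, $\varepsilon > 0$, and k-colored r-graphon $\mathbf{W}$ there exists an (r − 1)-symmetric
partition $\mathcal{P} = (P_1,\dots,P_m)$ of $[0,1]^{h([r-1])}$ into $m \leq t_{\mathrm{reg}}(r,k,\varepsilon,t)$ parts
and an (r − 1)-symmetric (r, r − 1)-step function $\mathbf{V}$ with steps from $\mathcal{P}$, such that for any
partition $\mathcal{Q}$ of $[0,1]^{h([r-1])}$ into at most $mt$ classes we have

$$d_{\square,r,\mathcal{Q}}(\mathbf{W},\mathbf{V}) \leq \varepsilon.$$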

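The second lemma gives a sufficient condition for the existence of a coloring of a given
r-graphon that is close to a fixed colored r-graphon.

Lemma 2.5. Let $\varepsilon > 0$, let $\mathbf{U}$ be a t-colored r-graphon that is an (r, r − 1)-step function with steps
$\mathcal{P} = (P_1,\dots,P_m)$ and let $\mathbf{V}$ be a t-colored r-graphon with $d_{\square,r,\mathcal{P}}(\mathbf{U},\mathbf{V}) \leq \varepsilon$. For any $k \geq 1$ and any
$[t]\times[k]$-colored r-graphon $\hat{\mathbf{U}}$ that is an (r, r − 1)-step function with steps from $\mathcal{P}$ such that $[\hat{\mathbf{U}},k] = \mathbf{U}$
there exists a k-coloring of $\mathbf{V}$ denoted by $\hat{\mathbf{V}}$ so that

$$d_{\square,r,\mathcal{P}}(\hat{\mathbf{U}},\hat{\mathbf{V}}) \leq k\varepsilon.$$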

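Let $d_{\mathrm{tv}}$ denote the total variation distance between probability measures on the set of k-colored r-graphs on q vertices. Let
$\mu(q,\mathbf{G})$ and $\mu(q,\mathbf{W})$ denote the probability measure of $\mathbb{G}(q,\mathbf{G})$ and $\mathbb{G}(q,\mathbf{W})$ respectively.
From (2.1) it follows for each k-colored r-graph $\mathbf{G}$ of size n that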

kLq^ 

dtv(p(^?, Wg), piq, G))<^. (2.2) 

The third statement provides an upper bound on the total variation distance of the
probability measures of random r-graphs in terms of the cut distance of the underlying r-graphons.

Lemma 2.6. If $\mathbf{U}$ and $\mathbf{W}$ are two k-colored r-graphons, then

dUt^{q,W),p{q,V)) < ^rfn,.(U,W), 

and there exists a coupling in form of Gi and G 2 of the random r-graphs G{q, W) and G{q, U) 
respectively, such that 

P(Gi ^ G2) < ^rfn,,(U,W). 
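The link between total variation distance and couplings used in Lemma 2.6 rests on a standard fact: two distributions on a finite set admit a coupling in which the two outcomes differ with probability exactly equal to their total variation distance. The Python sketch below computes both quantities for distributions given as dictionaries; the function names are illustrative.

```python
def total_variation(p, q):
    """d_tv between two distributions given as dicts outcome -> probability."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def maximal_coupling_agreement(p, q):
    """P(X = Y) under a maximal coupling, which equals 1 - d_tv(p, q)."""
    return sum(min(p.get(x, 0.0), q.get(x, 0.0)) for x in set(p) | set(q))

# Small example with a partially shared support.
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.25, "b": 0.25, "c": 0.5}
distance = total_variation(p, q)
agreement = maximal_coupling_agreement(p, q)
```

In the example the distance and the agreement probability both equal 1/2, in line with the identity P(X = Y) = 1 − d_tv under a maximal coupling.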


3 Effective upper bound for the r-cut norm of a sampled 
r-graph 


We are going to establish upper and lower bounds for the r-cut norm of an r-kernel using
certain subgraph densities. Let $W$ be an r-kernel, and $H \subseteq \binom{[q]}{r}$ be a simple r-graph on q
vertices, define

$$t^*(H,W) = \int_{[0,1]^{h([q],r-1)}} \prod_{e\in H} W(x_{h(e,r-1)})\,\mathrm{d}\lambda(x),$$

this expression is a variant of the subgraph densities discussed above in Section 2. Using
the previously introduced terminology we can write

$$t^*(H,W) = \sum_{H \subseteq F \subseteq \binom{[q]}{r}} t(F,W).$$

Let $K_2^r$ denote the simple r-graph that is the 2-fold blow-up of the r-graph consisting
of r vertices and one edge. That is, $V(K_2^r) = \{v_1^0,\dots,v_r^0,v_1^1,\dots,v_r^1\}$ and $E(K_2^r) = \{\{v_1^{i_1},\dots,v_r^{i_r}\} :
i_1,\dots,i_r \in \{0,1\}\}$; alternatively we may regard $K_2^r$ as a subset of $\binom{[2r]}{r}$.

It was shown in Borgs et al. [5] for r = 2 with tools from functional analysis that for
any symmetric 2-kernel $W$ with $\|W\|_\infty \leq 1$ we have that

$$\tfrac{1}{4}\, t^*(K_2^2,W) \leq \|W\|_{\square,2} \leq [t^*(K_2^2,W)]^{1/4}, \qquad (3.1)$$

where $[t^*(K_2^2,\cdot)]^{1/4}$ is called the trace norm or the Schatten norm of the integral operator $T_W$.
We remark that in the above case $K_2^2$ stands for the 4-cycle. Furthermore, it is not hard to
show that for any r and r-kernel $W$ it holds that $t^*(K_2^r,W) \geq 0$. It holds that


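$$
\begin{aligned}
t^*(K_2^r,W) &= \int_{[0,1]^{h(V(K_2^r),r-1)}} \prod_{i_1,\dots,i_r\in\{0,1\}} W\bigl(x_{h(\{v_1^{i_1},\dots,v_r^{i_r}\},r-1)}\bigr)\,\mathrm{d}\lambda(x) \\
&= \int_{[0,1]^{T_1}} \int_{[0,1]^{T_2}} \prod_{i_1,\dots,i_r\in\{0,1\}} W\bigl(x_{h(\{v_1^{i_1},\dots,v_r^{i_r}\},r-1)}\bigr)\,\mathrm{d}\lambda(x_{T_2})\,\mathrm{d}\lambda(x_{T_1}) \\
&= \int_{[0,1]^{T_1}} \Bigl[\int_{[0,1]^{T_2^0}} \prod_{i_1,\dots,i_{r-1}\in\{0,1\}} W\bigl(x_{h(\{v_1^{i_1},\dots,v_{r-1}^{i_{r-1}},v_r^0\},r-1)}\bigr)\,\mathrm{d}\lambda(x_{T_2^0})\Bigr] \\
&\qquad\quad \cdot \Bigl[\int_{[0,1]^{T_2^1}} \prod_{i_1,\dots,i_{r-1}\in\{0,1\}} W\bigl(x_{h(\{v_1^{i_1},\dots,v_{r-1}^{i_{r-1}},v_r^1\},r-1)}\bigr)\,\mathrm{d}\lambda(x_{T_2^1})\Bigr]\,\mathrm{d}\lambda(x_{T_1}) \\
&= \int_{[0,1]^{T_1}} \Bigl[\int_{[0,1]^{T_3\setminus T_1}} \prod_{i_1,\dots,i_{r-1}\in\{0,1\}} W\bigl(x_{h(\{v_1^{i_1},\dots,v_{r-1}^{i_{r-1}},u\},r-1)}\bigr)\,\mathrm{d}\lambda(x_{T_3\setminus T_1})\Bigr]^2\,\mathrm{d}\lambda(x_{T_1}),
\end{aligned}
$$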


where $T_1 = h(V(K_2^r) \setminus \{v_r^0,v_r^1\}, r-1)$, $T_2^i = h(V(K_2^r) \setminus \{v_r^{1-i}\}, r-1) \setminus h(V(K_2^r) \setminus \{v_r^0,v_r^1\}, r-1)$ is the subset
of $h(V(K_2^r), r-1)$ whose elements contain $v_r^i$, but not $v_r^{1-i}$, for $i \in \{0,1\}$, $T_2 = T_2^0 \cup T_2^1$, and
$T_3 = h((V(K_2^r) \setminus \{v_r^0,v_r^1\}) \cup \{u\}, r-1)$. We used Fubini's theorem, which enabled us to integrate
first over coordinates with indices from $T_2$, which we could then use to identify $v_r^0$ and $v_r^1$.

In the proof of (3.1) the authors drew on tools from functional analysis and the fact that
a 2-kernel describes an integral operator; those concepts do not have a natural counterpart
for r-kernels. However, we can provide an analogous result by the repeated application
of Fubini's theorem and the Cauchy-Schwarz inequality in the $L^2$-space.
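For r = 2 the two-sided bound between the 4-cycle density and the cut norm can be checked numerically on step kernels. The Python sketch below is an illustration under stated assumptions: the kernel is the step function of a symmetric n × n matrix A with entries in [−1, 1], for which the 4-cycle density equals trace(A⁴)/n⁴ and the cut norm is attained on unions of steps, so that a brute-force search over vertex subsets suffices. The function names are hypothetical.

```python
import itertools
import random

def c4_density(a):
    """4-cycle density trace(A^4) / n^4 of the step kernel of a symmetric matrix A."""
    n = len(a)
    a2 = [[sum(a[i][k] * a[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    trace_a4 = sum(a2[i][j] * a2[j][i] for i in range(n) for j in range(n))
    return trace_a4 / n ** 4

def cut_norm(a):
    """Brute-force cut norm of the step kernel: maximum over vertex subsets S and T."""
    n = len(a)
    best = 0.0
    for s in itertools.product((0, 1), repeat=n):
        # Column sums restricted to rows in S; the best T takes one sign class.
        row = [sum(a[i][j] for i in range(n) if s[i]) for j in range(n)]
        pos = sum(v for v in row if v > 0)
        neg = -sum(v for v in row if v < 0)
        best = max(best, pos, neg)
    return best / n ** 2

# Single edge on two vertices: density 1/8, cut norm 1/2.
edge = [[0.0, 1.0], [1.0, 0.0]]
density, norm = c4_density(edge), cut_norm(edge)
```

For the single edge the chain reads 1/32 ≤ 1/2 ≤ (1/8)^{1/4} ≈ 0.595, so both inequalities hold with room to spare.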

Lemma 3.1. For any $r \geq 1$ and r-kernel $W$ with $\|W\|_\infty \leq 1$ we have

$$2^{-r}\, t^*(K_2^r,W) \leq \|W\|_{\square,r} \leq [t^*(K_2^r,W)]^{1/2^r}. \qquad (3.2)$$

Proof. The lower bound on $\|W\|_{\square,r}$ is straightforward, and $K_2^r$ could even be replaced by
any other simple r-graph; we only need to use $2^{-r}\|W\|_{\boxplus,r} \leq \|W\|_{\square,r}$.

For the other direction, let us fix a collection of arbifrary symmefric measurable func¬ 
tions /i,..., /r: [ 0 , l]^(Wi]) [Q^ 5 el; y = ..., u,} and for any I > 1 and q,..., 1 / e { 0 , 1 } 

lef 

." = {v[\...,vfVi+i,...,Vr}. 

Furfher, lef Vj = V \ {vf and for ) > / -l-1 lef = y^.-.n \ pef us infroduce the index 
sets Ti = h(V'i), Si = h(V', r - 1 ) \ Ti and for 1 < / < r fhe sefs 


S? = l(c \ {!>,!) u |B?||c e S,|, s; = |(c \ {!>,!) u IbJIIc e S,|, 
Tm = u„.ieiojiMv, 

and 


S,+i = {Ti U 5° U Sj) \ Tui. 

Then we have 


I ]~[ //(^h(y,))W(Xh(V,r-l))dA(Xh(V,r-l)) 

J[0,l]h(V,r-l) 
I fl{Xh(Vi)) I ]~[ /;(^h(y^))W(Xh(y,r-l))dA(XSi 


)dA(xTi) 


/[ 0 , 1]"1 7=2 
ll /2 


I /:f(^h(yi))A(xri) I I ]~[ /j(Xh(y^.))W(Xh(y,r-i))dA(xsj) 

L V 

Y ( r ^ 

- I I ,0 TT//(^h(yO))M^h(yo,r-i))dA(Xso) 

.J[o,i]’’i d[o,i] 1 7=t ’ 


dA(xTi) 


1/2 


I cl TT /;(^h(yi))W(^h(yi,r-i))dA(xsi 

J[04]®i %2 ^ 


dA(xTi) 


1/2 


where we used $\|f_1\|_\infty \leq 1$ and the identity $\int \bigl(\int f(x,y)\,\mathrm{d}y\bigr)^2 \mathrm{d}x = \int f(x,y) f(x,z)\,\mathrm{d}y\,\mathrm{d}z\,\mathrm{d}x$ in the
previous inequality. We proceed by upper bounding the last expression through repeated
application of this reformulation combined with Cauchy-Schwarz.
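The variable-doubling identity used in this step has an exact discrete analogue, with sums over a finite array in place of integrals. The Python sketch below checks it on a small example; the function names and the test array are illustrative assumptions.

```python
def squared_inner_sum(f):
    """Discrete analogue of the integral over x of (integral of f(x, y) dy)^2."""
    return sum(sum(row) ** 2 for row in f)

def doubled_variable_sum(f):
    """Discrete analogue of the triple integral of f(x, y) f(x, z) dy dz dx."""
    return sum(row[y] * row[z] for row in f
               for y in range(len(row)) for z in range(len(row)))

# Rows play the role of x, columns the role of y (and of the copy z).
table = [[1, 2], [3, -1]]
left, right = squared_inner_sum(table), doubled_variable_sum(table)
```

Both sides evaluate to 13 for the example array, which mirrors how each application of the identity doubles the inner variables in the proof.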


r ^ ri .‘'- 1 )^ 

li. Lem) 


I n n //K(y-i./-i))^(^h(W..M))dA(xs,) 

.tie( 0 ,l}H+l ^ 


dA(xT,) 


2t-l 


< 





\ 

2 

f . 

n 

fl(x 

h(y;i.''-p^ 

A(xt,) 

J[0,1] ' 


) 

- 



r. 

( 

r. 

n 




d[0,l]''' u 

\ il/- 

v!(-ie(0,ll ;■ 


dA(xr,) 


< 


r / r 

I I co n n //K(y'i ./-i'“))^(^h(Vi.■M-0))dA(Xso) 

lJmf‘[Jmfii^ .Lei041/=/A '/ ' 

/ r 

I cl n n //K(y'i.-<-ih)^(^h(Vi..M-i))dA(xsi) 

.te(o,i)H+i 

I TucOuci n n//K{y'i ./))^(^h(Vi..o)dA(xso)dA(xsi)dA(xr,) 

J[o,if4 J4o,i};=w ^ 

r 

J T„. n /'+iK(y;r'')^ 


dA(xT,) 


LU[o,i]^'+i 


V/i.!ie{ 0 ,ll 


10 



< 





,,,j)W(Xh(y.i.;,))dA(xs,^i) 

/ 


2 ^ 


dA(xT,,i) 




W(Xh(y.i.,.))dA(xs,) 


hf-Jr^lOAl 




where in subsequent inequalities we first used the Cauchy-Schwarz inequality, and afterwards
that $\|f_j\|_\infty \leq 1$ for any $j \in [r]$. As the test functions $f_1,\dots,f_r$ were arbitrary the
statement of the lemma follows. □


Utilizing the previous result we can obtain a quantitative upper bound on the cut norm 
of the sampled kernel for arbitrary r. 

Lemma 3.2. Let r,k > 1. For any £ > 0 and f > 1 there exists an integer ^?cut(h U f) 
c(l/e)^ 'k^r^for some universal constant c > 0 such that for any k-tuple ofr-kernels Ui, ...,Uk 
that take values in [-1,1 ], and any integer q > qcutiLK £, 0 it holds with probability at least 1- e 


that if 

1=1 

then 

k 


sup Y \\WG(q,Ui)\\n,r,Q < £■ 
Q,tQ<t l-l 


where the supremum at both places goes over symmetric partitions Q of[0,1 ]^^^'' into at most t 
classes. 


Proof. Let r,k,t >1 and £ > 0 be fixed, and let Lfi ,... ,Ui and q be arbitrary It is a standard 
sampling result that for any r-kernel U, positive integer q, and F E Q'' we have that 

P(|f(f, U) - f{F,C{q, U))| > 6) < 2exp(-U^) 

for any 6 > 0, in particular for F = we have 

nimf U) - FiKfGiq, U))\ > 6/2) < 2exp(-^). 

Then we can estimate sup^ Y!i=i \\^G{q,Ui)\\n,r,Q using Lemma 3.1. Set 6 = ^ , and 

let q be as large such that 2k exp < £■ Let A denote the set of all r-arrays of size t 

with {-1,1} entries. Then we have 
k 

sup Y \\Wc(ci,Ui)\\n,r,Q 
Q,tQ<t l-l 

= sup max sup 

Q,tQ<t 

’ jeWem 

k t p r 

E E A{ii, ...Jr) I ^C{q,Ui)iXh{[rlr-l)) ^T'nQi. (^h([r]\i;)))dA(Xh([r],r-l)) 

1=1 tl,...,lr=l [04]h(M,>-l) 

k 

< f ^ \ \AfG(q,Ui)\\n,r 
1=1 

1=1 

1=1 

k 

<fY^{r\\Ul\\a,r + 5 )^^^' <£r 
1=1 

and the assumptions of the calculation, in particular the fourth inequality, hold true with 
probability at least 1 - £. For convenience, the first inequality is true by definition, the 
third holds by (2.1), whereas the second and the fifth are the consequence of Lemma 3.1. 

□ 


4 Proof of the main result 

The next lemma is a crucial component in the proof of the main result.

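Lemma 4.1. For every $r,t,k,q_0 \geq 1$ and $\delta > 0$ there exists an integer $q_{\mathrm{tv}} = q_{\mathrm{tv}}(r,\delta,q_0,t,k) \geq 1$
such that for every $q \geq q_{\mathrm{tv}}$ the following holds. Let $\mathbf{U} = (U^\alpha)_{\alpha\in[t]}$ be a t-colored r-graphon and let
$V^\alpha$ denote $W_{\mathbb{G}(q,U^\alpha)}$ for each $\alpha \in [t]$, also let $\mathbf{V} = (V^\alpha)_{\alpha\in[t]}$, so $W_{\mathbb{G}(q,\mathbf{U})} = \mathbf{V}$. Then with probability at
least $1-\delta$ there exists for every k-coloring $\hat{\mathbf{V}} = (\hat V^{(\alpha,\beta)})_{\alpha\in[t],\beta\in[k]}$ of $\mathbf{V}$ a k-coloring $\hat{\mathbf{U}} = (\hat U^{(\alpha,\beta)})_{\alpha\in[t],\beta\in[k]}$
of $\mathbf{U} = (U^\alpha)_{\alpha\in[t]}$ such that we have

$$d_{\mathrm{tv}}(\mu(q_0,\hat{\mathbf{V}}),\mu(q_0,\hat{\mathbf{U}})) \leq \delta.$$

The bound qtv{r, 6, qo, Tk) can be chosen in a way so that q^(r, b, qo, t,k) < exp^^^'' Cr{j)^{kt)^‘>o 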

for some constant Cr > 0 only depending on the dimension r. 


The proof is to a large extent identical to the proof of Lemma 5.1 in Karpinski and Marko
[16]; the only part that is changed is where we replace the non-effective ultralimit method
used in that proof by Lemma 3.2. However, the two statements that are exchanged do not
coincide, thus some technical adjustment needs to be carried out. Next we present a
sketch of the proof of Lemma 4.1 by outlining the main steps; for the details we refer to
Lemma 5.1 in Karpinski and Marko [16].


Proof. We proceed by induction with respect to r. The case of r = 1 can be verified the same 

way as in [16], and qtvO-, b, qo, t, k) = — satisfies the conditions of the lemma. 

Now assume that we have already verified the statement of the lemma for r − 1 and
any choice of the other parameters of $q_{\mathrm{tv}}$. We will conduct the proof for the case of
r-graphons, therefore let $\delta > 0$ and $t,k,q_0 \geq 1$ be arbitrary and fixed; q is to be determined
below, and let $\mathbf{U}$, $\mathbf{V}$, and $\hat{\mathbf{V}}$ be as in the conditions of the lemma. We outline the steps in
order to obtain a k-coloring $\hat{\mathbf{U}}$ for $\mathbf{U}$.

Let A = n{r,b,qo,t,k) = Set f 2 = treg{r,tk, A,l) and h = freg(r, f, 12 ), 


4/c(/:f )^o 


and define qtv{r,b,qo,t,k) = max{^^tv(?' “ A, f 2 )}- Note that f 2 ^ 

exp^^^(c(l/A)^) and fi < exp^^^(c(l/A)^) for a large enough constant c > 0. We also as¬ 
sume that qiv{r - 1, b, qo, t, k) < exp^‘^^(c,_i ) for some positive integer d and real c,-i > 0, 

where A' = n(r - 1, b, qo, t, k) = 


^k(kt/o ' 


. Then it follows 


qtvir - l,blT,qo,h,t2) < exp^‘^*^\cr{lIAf) 


(4.1) 


for some c, > 0. Since we can adjust the constant factor c,-i in a way that qtvir - 
l,b/A,qo,ti,t 2 ) > qcutiATKh)} tor any possible choice of the parameters we conclude 
that qtviT b, qo, t,k) is upper bounded by ^ > q^^{r, b, qo, t,k) be arbi¬ 

trary. We describe now the step for the construction of U that satisfies the conditions of 
the lemma. 


We approximate V by some function Z that is only given implicitly by means of 
Lemma 2.4. We have 


sup da,r,Q{y, Z) < A, 

Q,tci<tK 


where Ti denotes the set of the steps of Z, and < f 2 holds. 


• We set Z = [Z, k], consequently 


sup da,r,Q{y, Z) < A. 

Q,tQ<tji 

Note that Z and Z depend on V. 


(4.2) 
• We apply again Lemma 2.4 with the proximity parameter A/2 to r-graphon U to 
approximate it by Wi = (Wj,..., Wp with steps in f* that satisfies 

sup rfn,r,Q(Wi,U) < 

Q,tQ<tpt2 

where the supremum runs over all (r - l)-symmetric partitions Q of [ 0 , with 

at most tpt 2 classes, and t<p <ti. 

• Define W 2 = to be the r-graphon representing G(^^, Wi), so W® represents 

G{q, VJ“) for each a G [f]. The steps of W 2 are denoted by V . Then it follows from 
Lemma 3.2 that 

sup da,r^{W 2 , V) < A, 

Q,tQ<t2 

with probability at least 1 - A, so consequently 

rfn.,«(W 2 , V) < A, 

with the same failure probability. Furthermore, with (4.2) we have 

rfn,,«(W2,Z)<2A. (4.3) 


• We define the ^-coloring W 2 of W 2 via Lemma 2.5, which by (4.3) certifies the 
existence of a ^-coloring such that 

< 2kA. 

The graphon W 2 is a symmetric step function with steps that form the coarsest 
partition that refines both P' and P, we denote this (r - l)-symmetric partition of 
[0, by S, the number of its classes satisfies ts = t<pd<ji < fif 2 - 

• We construct the ^-coloring Wi of Wi using the hypothesis that the current lemma 
is true for the case of r - 1 and the arbitrary choice of all other parameters. For the 
details we refer to the proof in [16]. The r-graphon Wi we obtained satisfies 

rftv(iU(^?o,Wi),p(^?o,W2)) < 6/4 

with probability at least 1 - 6/4. Also, Wi has at most t<pt 2 steps that refine P. 

• Lemma 2.5 provides the existence of U with [U,fc] = U with the bound as dnr(U,Wi) < 

2 • 


We conclude the proof by invoking Lemma 2.6 to verify that $\hat{\mathbf{U}}$ satisfies the conditions
of the lemma,

$$d_{\mathrm{tv}}(\mu(q_0,\hat{\mathbf{V}}),\mu(q_0,\hat{\mathbf{U}})) \leq \delta,$$

and the failure probability is at most $\delta$. □

Proof of Theorem 1.3. We proceed completely identically to the proof of the main result of
[16], we only have to substitute the current Lemma 4.1 for Lemma 5.1 in that paper, we 
only give a brief overview here, we refer for details to [16]. Set qo = qgis/T). The main 
observation is that provided the result of Lemma 4.1 we can find for any coloring $\hat F$ of $F$ a
coloring $\hat G$ of $G$ such that the distributions of $\mathbb{G}(q_0,W_{\hat G})$ and $\mathbb{G}(q_0,W_{\hat F})$ are close, hence the
random objects given by them can be coupled in a way so that they coincide with high
probability. Applying this together with the triangle inequality

$$
\begin{aligned}
|g(\hat G) - g(\hat F)| \leq{} & |g(\hat G) - g(\mathbb{G}(q_0,\hat G))| + |g(\mathbb{G}(q_0,W_{\hat G})) - g(\mathbb{G}(q_0,\hat G))| \\
& + |g(\mathbb{G}(q_0,W_{\hat G})) - g(\mathbb{G}(q_0,W_{\hat F}))| + |g(\mathbb{G}(q_0,\hat F)) - g(\mathbb{G}(q_0,W_{\hat F}))| \\
& + |g(\hat F) - g(\mathbb{G}(q_0,\hat F))|
\end{aligned}
$$

and the testability property of $g$ together with (2.2) gives the desired result. □


5 Parameters depending on densities of linear hypergraphs 

We present a special case of the above notion of ND-testability that preserves several 
useful properties of the graph case, r = 2. Restricting our attention to this sub-class we 
are able to essentially remove the dependence on r in the bound given by Theorem 1.3 on 
the sample complexity. 

A linear r-graph is an r-graph in which any two distinct edges intersect
in at most one vertex. A k-colored r-graph that may have absent edges is called linear if,
disregarding the colors of the edges present, these edges form a linear r-graph. We call an r-graph parameter
linearly ND-testable if it is ND-testable and its witness parameter depends only on the
$t^*$-densities of linear hypergraphs.
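The linearity condition is easy to test mechanically. The following Python sketch checks it for an r-graph given as a list of vertex tuples; the function name `is_linear` is chosen for this illustration.

```python
import itertools

def is_linear(edges):
    """True if any two distinct edges share at most one vertex."""
    edge_sets = {frozenset(e) for e in edges}
    return all(len(a & b) <= 1 for a, b in itertools.combinations(edge_sets, 2))

# Two 3-edges meeting in one vertex are linear; sharing two vertices is not.
loose_path = [(0, 1, 2), (2, 3, 4)]
tight_path = [(0, 1, 2), (1, 2, 3)]
```

For r = 2 the condition is vacuous, since two distinct edges of a simple graph share at most one vertex.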

In this section we depart from the graphon notion and use instead objects called naive 
r-graphons and naive r-kernels. These differ from true graphons and kernels in their domain 
that is the r-dimensional unit cube and whose coordinates correspond to nodes of r-edges 
instead of any proper subset of the set of nodes of an r-edge. They can be transformed 
into true graphons by adding dimension to the domain in a way that the values taken do 
not depend on the entries corresponding to the new dimensions. This way we can think 
of naive graphons as a special subclass of graphons, sampling is defined analogously to 
the general case. Note that for r = 2 the naive notion does not introduce any restriction as 
all nonempty proper subsets of a 2-element set are singletons. We require the notion of ground state
energies of r-graphs, naive r-graphons, and kernels from [6], see also [2].

Let $s \geq 1$, let $J$ be an r-array of size s, and let $G$ be an arbitrary r-graph. Define the ground
state energy (GSE) (see [6]) of the r-graph $G$ with respect to the r-array $J$ by

$$\hat{\mathcal{E}}(G,J) = \max_{\mathcal{Q}} \sum_{i_1,\dots,i_r=1}^{s} J(i_1,\dots,i_r) \int_{[0,1]^r} \prod_{j=1}^{r} \mathbb{1}_{Q_{i_j}}(x_j)\, W_G(x_1,\dots,x_r)\,\mathrm{d}x, \qquad (5.1)$$

where the maximum runs over all partitions $\mathcal{Q}$ of [0,1] into s parts. Analogously, define
the GSE of a naive r-kernel $U$ with respect to $J$ by

$$\mathcal{E}(U,J) = \max_{f} \sum_{i_1,\dots,i_r=1}^{s} J(i_1,\dots,i_r) \int_{[0,1]^r} \prod_{j=1}^{r} f_{i_j}(x_j)\, U(x_1,\dots,x_r)\,\mathrm{d}x,$$

where the maximum runs over all fractional partitions $f$ of [0,1] into s parts.
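A discrete analogue of the ground state energy can be computed by brute force on small graphs. The Python sketch below treats the case r = 2 and maximizes only over partitions that respect the vertex set, that is, over maps from the vertices to the s parts; this restriction, the normalization by n², and the function name are assumptions made for the illustration.

```python
import itertools

def discrete_gse(n, adjacency, j_matrix):
    """Brute-force maximum over maps phi: [n] -> [s] of
    (1 / n^2) * sum over u, v of J[phi(u)][phi(v)] * A[u][v]   (the case r = 2)."""
    s = len(j_matrix)
    best = float("-inf")
    for phi in itertools.product(range(s), repeat=n):
        energy = sum(j_matrix[phi[u]][phi[v]] * adjacency[u][v]
                     for u in range(n) for v in range(n))
        best = max(best, energy / n ** 2)
    return best

# 4-cycle with the max-cut array J: every edge can be cut, giving 8 ordered pairs out of 16.
cycle = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
max_cut_array = [[0, 1], [1, 0]]
energy = discrete_gse(4, cycle, max_cut_array)
```

With this choice of J the energy is twice the maximum cut size divided by n², which shows how max-cut type parameters fit the GSE framework.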

The next result was first proved in [2], subsequently refined in [14]. 

Theorem 5.1. Let r > 1, s > 1, and 6 > 0. Then for any r-kernel U, real r-array J, and 
q log(©) with © = 22 ^ have that 

¥i\TiU,})-t{Giq,U),})\ > 6 ||Lf|U) < 2exp|-0j. (5.2) 

We require the version of the norms and distances given in Section 2 for the naive 
setting. 

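Definition 5.2. The cut-*-norm of a naive r-kernel $W$ is

    \|W\|^{*}_{\square,r} = \sup_{S_i \subset [0,1],\ i \in [r]} \left| \int_{S_1 \times \dots \times S_r} W(x) \, d\lambda(x) \right|,

where the supremum is taken over measurable sets $S_i \subset [0,1]$ for each $i \in [r]$. Furthermore, for a
partition $\mathcal{P} = (P_i)_{i=1}^{t}$ of [0,1] the cut-(*,P)-norm of a naive r-kernel is defined by

    \|W\|^{*}_{\square,r,\mathcal{P}} = \sup_{S_i \subset [0,1],\ i \in [r]} \sum_{j_1, \dots, j_r = 1}^{t} \left| \int_{(S_1 \cap P_{j_1}) \times \dots \times (S_r \cap P_{j_r})} W(x) \, d\lambda(x) \right|.

The cut-(*,P)-distance $d^{*}_{\square,r,\mathcal{P}}$ of graphs and graphons is defined analogously to Definition 2.2,
exchanging the cut-P-norm for the cut-(*,P)-norm.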

The definition for the fc-colored version is analogous. 
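
As an illustration of Definition 5.2, here is a minimal brute-force sketch for r = 2 and a kernel that is a step function with n equal steps, given as an n x n matrix. Two assumptions are made that go beyond the text: the supremum is searched only over unions of steps (for a step function the expression is convex in each set separately, so this should lose nothing), and the partitioned variant is read as a sum of absolute values over the cells of the partition. The names `cut_norm` and `classes` are hypothetical.

```python
from itertools import product

def cut_norm(W, classes=None):
    """Brute-force cut norm of an n x n step kernel (r = 2).

    classes: optional partition of the step indices; if given, absolute
    values are taken cell by cell and summed (partitioned variant).
    """
    n = len(W)
    if classes is None:
        classes = [list(range(n))]  # trivial partition gives the plain cut norm
    best = 0.0
    # S and T are indicator vectors of unions of steps.
    for S in product((0, 1), repeat=n):
        for T in product((0, 1), repeat=n):
            total = 0.0
            for A in classes:
                for B in classes:
                    total += abs(sum(W[i][j] for i in A if S[i] for j in B if T[j]))
            best = max(best, total / n**2)
    return best

W = [[1, -1], [-1, 1]]
print(cut_norm(W))                # 0.25
print(cut_norm(W, [[0], [1]]))    # 1.0
```

The example shows that refining the partition can only increase the norm: cancellations between cells are no longer allowed.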

We require the following auxiliary lemmas that are analogous to Lemma 2.4, Lemma 2.6, 
and Lemma 2.5, respectively (with analogous proofs). 

Lemma 5.3. For every $r \ge 1$, $\varepsilon > 0$, $t \ge 1$, $k \ge 1$ and k-colored r-graphon $W$ there exists a

partition P = (Pi,..., P^) of [ 0 , 1 ] into m < ( 2 f)^''^+^^ ' = freg(h k, e, t) parts and a naive (r, 1 )- 
step function $V$ with steps from $\mathcal{P}$, such that for any partition $\mathcal{Q}$ of [0,1] into at most $mt$
classes we have

    d^{*}_{\square,r,\mathcal{Q}}(W, V) \le \varepsilon.

Lemma 5.4. Let $U$ and $W$ be k-colored r-kernels with $\|U\|_\infty, \|W\|_\infty \le 1$. Then for every linear
k-colored r-graph $F$ we have


\t\F,W)-t\F,U)\<i^^dfMW). 

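Lemma 5.5. Let $k \ge 1$, $\varepsilon > 0$, $U$ be a step function with steps $\mathcal{P} = (P_1, \dots, P_t)$ and $V$ be an
r-graphon with $d^{*}_{\square,r,\mathcal{P}}(U, V) \le \varepsilon$. For any k-colored r-graphon $\mathbf{U} = (U^1, \dots, U^k)$ that is a step
function with steps from $\mathcal{P}$ and a k-coloring of $U$ there exists a k-coloring $\mathbf{V} = (V^1, \dots, V^k)$ of $V$
so that $d^{*}_{\square,r,\mathcal{P}}(\mathbf{U}, \mathbf{V}) = \sum_{i=1}^{k} d^{*}_{\square,r,\mathcal{P}}(U^i, V^i) \le k\varepsilon$.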

Next we state and prove the main contribution of this section. 

Theorem 5.6. Let $f$ be a linearly ND-testable r-graph parameter with witness parameter $g$ of
k-colored r-graphs, and let the corresponding sample complexity be $q_g$. Then $f$ is testable with
sample complexity $q_f$, and there exists a constant $c > 0$ only depending on $k$ and $r$ but not on $f$ or
$g$ such that for any $\varepsilon > 0$ we have

    q_f(\varepsilon) \le \exp^{(3)}(c \, q_g^2(\varepsilon/2)).    (5.3)

Proof. The proof is almost identical to the case of graphs in Karpinski and Marko [15]; we
will sketch it in the framework of Lemma 4.1, from where the statement follows in a similar
way as in the proof of Theorem 1.3. The main distinction between the general setting and
the current linear setting is that we do not require for each coloring $\mathbf{V}$ of $V$ to have a
corresponding coloring $\mathbf{U}$ of $U$ such that their distributions are close in the
total variation distance; here it is enough to impose that they are close in the distance of
Definition 5.2. This relaxed condition implies that the conditional $q_g$-sampled distributions are close, where
the condition comprises the densities of linear sub-hypergraphs. The different norm
employed in the measurements of the proximity allows us to remove the inductive part
that is contained in the general proof in Lemma 4.1.

Let $f$ and $g$ be as in the statement of the theorem, and let $G$ be an arbitrary
r-graph and $W_G$ a 3-colored naive r-graphon that represents it (the colors correspond to
edges, non-edges, and diagonal entries respectively). Let $q$ be at least the right-hand side of (5.3) for some
$c > 0$ that is chosen large enough, let $F$ denote the random r-graph $G(q, G)$, and let
$W_F$ be its 3-colored representative graphon. It is easy to see, as in the general case, that
$f(F) \ge f(G) - \varepsilon/4$ with probability at least $1 - \varepsilon/4$; in fact this is even true with much
smaller $q$.

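We will show first that with probability at least $1 - \varepsilon/4$ there exists for every k-coloring
$\mathbf{V} = (V^{\alpha,\beta})_{\alpha \in [3], \beta \in [k]}$ of $W_F$ a k-coloring $\mathbf{U} = (U^{\alpha,\beta})_{\alpha \in [3], \beta \in [k]}$ of $W_G$ such that $d^{*}_{\square,r}(\mathbf{U}, \mathbf{V}) \le \Delta$,
where $\Delta = \exp(-c \, q_g^2(\varepsilon/2))$. Let $W_1$ be a naive r-graphon that satisfies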

    \sup_{\mathcal{Q} :\, t_{\mathcal{Q}} \le t_{\mathcal{P}} t_2} d^{*}_{\square,r,\mathcal{Q}}(W_G, W_1) \le \Delta/8k;    (5.4)

by Lemma 5.3 there exists such a naive (r,1)-step function with at most $t_1 = t_{\mathrm{reg}}(r, 2, \Delta/8k, t_2)$
steps that are denoted by $\mathcal{P}$, where $t_2 = t_{\mathrm{reg}}(r, 3k, \Delta/8k, 1)$. Further, let $W'$ be the naive
(r,1)-step function associated with $G(q, W_1)$ with its steps forming the partition $\mathcal{P}''$. There
exists a measure-preserving permutation $\varphi$ of [0,1] such that $W_2$ given by $W_2(x_1, \dots, x_r) =
W'(\varphi(x_1), \dots, \varphi(x_r))$ is another valid representation of $G(q, W_1)$ with steps $\mathcal{P}'$, and having
the additional property that the measure of the set where $W_1$ and $W_2$ differ is at most
In particular, by the choice of $q$ it is true that $\|W_1 - W_2\|_1 \le \Delta/8k$ with
probability at least $1 - \varepsilon/8$.

Further, the bound in (5.4) can be rewritten as a GSE problem in the sense of (5.1);
applying Theorem 5.1 leads to the assertion that

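    \sup_{\mathcal{Q} :\, t_{\mathcal{Q}} \le t_{\mathcal{P}} t_2} d^{*}_{\square,r,\mathcal{Q}}(W_F, W_2) \le \Delta/4k,    (5.5)

with probability at least $1 - \Delta/8k$, which is larger than $1 - \varepsilon/8$.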

We condition on the aforementioned two events; they occur jointly with probability
at least $1 - \varepsilon/4$. Now let $\mathbf{V}$ be an arbitrary k-coloring of $W_F$; it follows that there exists a
3k-colored naive (r,1)-step function $\mathbf{Z} = (Z^{\alpha,\beta})_{\alpha \in [3], \beta \in [k]}$ with steps forming $\mathcal{P}$ such that

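    \sup_{\mathcal{Q}} d^{*}_{\square,r,\mathcal{Q}}(\mathbf{V}, \mathbf{Z}) \le \Delta/4k,    (5.6)

and tji < t 2. Let the naive r-graphon $Z$ denote the k-discoloring of $\mathbf{Z}$. Then we have

    \sup_{\mathcal{Q}} d^{*}_{\square,r,\mathcal{Q}}(W_F, Z) \le \Delta/8k,    (5.7)

and together with (5.5) it follows that

    \sup_{\mathcal{Q}} d^{*}_{\square,r,\mathcal{Q}}(W_2, Z) \le \Delta/4k.    (5.8)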

An application of Lemma 5.5 together with the bound in (5.8) ensures the existence of
a k-coloring $\mathbf{W}_2$ of $W_2$ that is a naive (r,1)-step function with the steps comprising $\mathcal{S}$ that
is the coarsest common refinement of $\mathcal{P}'$ and $\mathcal{P}$, and that satisfies

    d^{*}_{\square,r}(\mathbf{W}_2, \mathbf{Z}) \le \Delta.    (5.9)

Now we construct a k-coloring $\mathbf{W}_1$ of $W_1$ by simply copying $\mathbf{W}_2$ on the set $[\cup_i (P_i \cap P'_i)]^r$, and
defining it in an arbitrary way on the rest of $[0,1]^r$, paying attention to keep it a k-coloring of
$W_1$ and not increase the number of steps above f.?^. For the $\mathbf{W}_1$ obtained this way we have

    \sum_{\alpha,\beta} \|W_1^{\alpha,\beta} - W_2^{\alpha,\beta}\|_1 = d_1(\mathbf{W}_1, \mathbf{W}_2) \le \Delta/4.    (5.10)

Employing again Lemma 5.5 with (5.4) we obtain a k-coloring $\mathbf{U}$ of $W_G$ that satisfies

    d^{*}_{\square,r}(\mathbf{U}, \mathbf{W}_1) \le \Delta,

hence

    d^{*}_{\square,r}(\mathbf{U}, \mathbf{V}) \le 4\Delta.

With a further randomization we can form a proper k-coloring $\mathbf{G}$ of $G$ that satisfies

    d^{*}_{\square,r}(W_{\mathbf{G}}, \mathbf{V}) \le 5\Delta.


Finally, we use that

    |g(\mathbf{F}) - g(\mathbf{G})| \le |g(\mathbf{F}) - g(G(q_g(\varepsilon/4), \mathbf{F}))| + |g(\mathbf{G}) - g(G(q_g(\varepsilon/4), \mathbf{G}))| \le \varepsilon/2,

whenever there exists a coupling of the random 2k-colored r-graphs $G(q_g(\varepsilon/4), \mathbf{G})$ and
$G(q_g(\varepsilon/4), \mathbf{F})$ appearing in the above formula such that their densities of linear subgraphs
are equal with probability larger than $\varepsilon/2$. Such a coupling exists by Lemma 5.4 and
standard probabilistic assumptions, thus we have $f(G) \ge f(F) - \varepsilon/2$ with probability at
least $1 - \varepsilon/4$, which concludes the proof.

□ 


6 Applications 

The characterization of testability of properties of r-uniform hypergraphs for $r \ge 3$ is
a well-studied area; for instance, it has been established by Rödl and Schacht [19] that
hereditary properties (properties that are preserved under the removal of vertices) are
testable, generalizing the situation in the graph case. Nevertheless, several questions
analogous to the graph case have remained open. We present some of these in this section
together with the proofs for positive results as an application of Theorem 1.3.

6.1 Energies and partition problems 

We define a family of parameters of r-uniform hypergraphs that is a generalization of the
ground state energies (GSE) of Borgs et al. [6] in the case of graphs (see also Section 5);
for connections to statistical physics, in particular to the Ising and the Curie-Weiss model,
see [6]. This notion encompasses several important graph optimization problems, such as
the maximal cut density and multiway cut densities for graphs; therefore its testability is
central to several applications.

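Definition 6.1. For an r-graph $H \subset \binom{[n]}{r}$, a real r-array $J$ of size $q$, and a symmetric partition
$\mathcal{P} = (P_1, \dots, P_q)$ of $\binom{[n]}{r-1}$ we define the energy

    \mathcal{E}_{\mathcal{P},r-1}(H, J) = \frac{1}{n^r} \sum_{i_1, \dots, i_r = 1}^{q} J(i_1, \dots, i_r) \, e_H(r; P_{i_1}, \dots, P_{i_r}),

where $e_H(r; S_1, \dots, S_r) = |\{(u_1, \dots, u_r) \in [n]^r \mid \{u_1, \dots, u_r\} \in H$ and $\{u_1, \dots, u_{j-1}, u_{j+1}, \dots, u_r\} \in
S_j$ for all $j = 1, \dots, r\}|$.

Let $\mathbf{H} = (H^\alpha)_{\alpha \in [k]}$ be a k-colored r-uniform hypergraph on the vertex set [n] and $\mathbf{J} = (J^\alpha)_{\alpha \in [k]}$ be
a tuple of real r-arrays of size $t$ with $\|\mathbf{J}\|_\infty \le 1$. Then the energy for a partition $\mathcal{P}$ as above is

    \mathcal{E}_{\mathcal{P},r-1}(\mathbf{H}, \mathbf{J}) = \sum_{\alpha \in [k]} \mathcal{E}_{\mathcal{P},r-1}(H^\alpha, J^\alpha).

The maximum of the energy over all partitions $\mathcal{P}$ of $\binom{[n]}{r-1}$ is called the generalized ground state
energy (GGSE) of $\mathbf{H}$ with respect to $\mathbf{J}$, and is denoted by

    \mathcal{E}_{r-1}(\mathbf{H}, \mathbf{J}) = \max_{\mathcal{P}} \mathcal{E}_{\mathcal{P},r-1}(\mathbf{H}, \mathbf{J}).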

A rather straightforward application of Theorem 1.3 gives us the testability of any 
GGSE. 

Corollary 6.2. For any $r, q \ge 1$ and real r-array $J$ of size $t$ the generalized ground state energy
$\mathcal{E}_{r-1}(\cdot, J)$ is a testable r-graph parameter.

We note that this result was proved previously in [16], Theorem 3.15; the proof there
used ultralimits and was therefore non-effective. The present corollary does not rely on
such tools, so we could provide an explicit upper bound on the sample complexity, and in
this sense the result is new.

The above problem of testing of the GGSE is a special case of the question regarding
testability of general partition problems. These properties were first dealt with systematically
in the graph case in [11], where the authors showed their testability. They are also
the most prominent family of non-trivial properties from the testing perspective in the
dense model that are testable with polynomial sample complexity known to date.

We sketch the problem briefly. Consider a vector of k positive reals adding up to 1 and
a symmetric matrix of size k with entries from [0,1], together forming a so-called density
tensor. The partition property associated to this tensor is satisfied by a graph whenever
there exists a partition of its vertex set so that the densities of the class sizes equal the
quantities given by the vector and the edge densities between the parts coincide with
the corresponding entries of the matrix. A property associated to a family of tensors is
satisfied whenever there exists a member of the family that is satisfied following the above
description. For example, we can throw away from a density tensor the condition on the
class sizes, or we can require that the edge densities between the classes lie in a certain
interval to obtain another, relaxed partition problem.

A test for the maximal cut density can be obtained from a collection of partition 
problems into two classes only constraining the edge density between the two distinct 
parts for each integer multiple of $\varepsilon$ in [0,1].
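
For intuition on what testability of the maximal cut density means in practice, the following minimal sketch evaluates the parameter on a uniformly sampled induced subgraph and uses that value as the estimate. This is a generic sample-and-evaluate illustration, not the partition-problem construction described above; the normalization by $n^2$, the sample size `q`, and the names `max_cut_density` and `estimate_max_cut_density` are assumptions of the sketch, and no claim is made about which `q` suffices for a given accuracy.

```python
import random
from itertools import product

def max_cut_density(adj):
    """Brute-force maximal cut of a small graph, normalized by n^2 (assumed)."""
    n = len(adj)
    best = 0
    for side in product((0, 1), repeat=n):
        # Count each crossing edge once.
        cut = sum(adj[u][v] for u in range(n) for v in range(u + 1, n)
                  if side[u] != side[v])
        best = max(best, cut)
    return best / n**2

def estimate_max_cut_density(adj, q, rng):
    """Evaluate the parameter on a uniformly random induced subgraph on q nodes."""
    nodes = rng.sample(range(len(adj)), q)
    sub = [[adj[u][v] for v in nodes] for u in nodes]
    return max_cut_density(sub)

# Complete graph on 10 nodes; every 4-node sample induces K_4.
k10 = [[int(u != v) for v in range(10)] for u in range(10)]
print(max_cut_density(k10))                                 # 0.25
print(estimate_max_cut_density(k10, 4, random.Random(0)))   # 0.25
```

On the complete graph the estimate is exact for any even sample size, since a balanced cut of $K_q$ contains $q^2/4$ edges; for general graphs the estimate is random and its accuracy depends on `q`.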

Research aimed at partition problems for hypergraphs was initiated by Fischer
et al. [9], defining a framework that slightly extended the notions of [11]. In their setup the
problem is formulated again as a question of existence of a vertex partition of a hypergraph
with prescribed sizes that satisfies that the r-partite sub-hypergraphs spanned by each r-tuple
of classes contain a certain number of edges. The additional feature of the approach
is that it can also handle tuples of uniform hypergraphs (perhaps of different order) that
share a common vertex set that is the subject of the partitioning; the partition problem,
defined again by density tensors, consists of constraints on edge densities between
classes for each of the component hypergraphs. In [9] it is shown that such properties are
testable with polynomial sample complexity.

A further generalization has been investigated by Rozenberg [20], dealing for the first
time with constraints imposed on partitions of pairs, triplets, and so on of the vertices
on one hand, and the edge densities filtered by these partitions on the other. However,
the edge density constraints in [20] are not partitioning the edge set as in the previous
approaches; rather, layers of partitions corresponding to partitions of [r] for r-graphs are
considered. Let us illustrate the framework for 3-graphs with the partitioning understood
as coloring. In [9] the number of edges whose vertices have certain colors is constrained;
in [20] one can also constrain the number of edges that fulfill the condition that a pair of
vertices (as a tuple) has a certain color and the third vertex (as a singleton) has also some
other color. However, in [20] only colorings of disjoint subsets of the r-edges are allowed
to yield a constraint; for instance, it is not possible to have a condition on the number of
pair-monochromatic edges, that is, 3-edges whose three underlying pairs have the same
color. The positive result obtained in [20] is also somewhat weaker than testability; the
term pseudo-testability is introduced in order to formalize the conclusion.

Our approach allows for more general constraints on edge densities. 

Definition 6.3. Let $\Phi$ denote the set of all maps $\varphi$ that assign to each element of the
set of proper subsets of [r], $h([r], r-1)$, a color in [k]. We define a density tensor by $\psi =
((\rho_s)_{s \in h([k], r-1)}, (\rho_\varphi)_{\varphi \in \Phi})$, where each component is in [0,1].

Let $H$ be an r-graph with vertex set $V = V(H)$ of cardinality $n$ and for each $1 \le s \le r-1$ let
$\mathcal{P}(s)$ be a partition of $\binom{V}{s}$ into k parts, and let $\mathcal{P} = (\mathcal{P}(s))_{s=1}^{r-1}$. Then the density tensor corresponding
to the pair $(H, \mathcal{P})$ is given by

pfH,P)J^ for all seh{[k],r-l), 

and

\{e ^ [nfle e H and pA{e) e Pcj,{A){\A\) for all A e h{[rl r - 1)}\ ^ . 

pp{H,P) = -- for all (j)GO, 

where v is the set that consists of the components of the vector v. We say that H satisfies a density 
tensor $\psi$ if there exists a collection of partitions $\mathcal{P}$ of its vertex tuples as above so that the tensor
yielded by the pair $(H, \mathcal{P})$ is equal to $\psi$.

We remark that the above partition property is non-hereditary. An application of our 
main result yields the following corollary. 

Corollary 6.4. For any $r \ge 1$ and a density tensor $\psi = ((\rho_s)_{s \in h([k], r-1)}, (\rho_\varphi)_{\varphi \in \Phi})$, the partition
property given by the tensor is testable.

6.2 Logical formulas

The characterization of testability in terms of logical formulas was initiated by Alon et al.
[1], who showed that properties expressible by certain first order formulas are testable,
while there exist some first order formulas that generate non-testable properties. The
result can be formulated as follows.

Theorem 6.5. [1] Let $l, k \ge 1$ and $\varphi$ be a quantifier-free first order formula of arity $l + k$ containing
only adjacency and equality. The graph property given by the truth assignments of the formula
$\exists u_1, \dots, u_l \, \forall v_1, \dots, v_k \, \varphi(u_1, \dots, u_l, v_1, \dots, v_k)$ with the variables being vertices is testable.

Without going into further details at the moment we mention that any $\exists\forall$ property of
graphs is indistinguishable by a tester from the existence of a node-coloring that is proper
in the sense that the colored graph does not contain subgraphs of a certain set of forbidden
node-colored graphs.

Our focus is directed at the positive results of [1], which were generalized in two
directions. First, by Jordan and Zeugmann [13] to relational structures in the sense that
$\varphi$ can contain several r-ary relations, even with $r \ge 3$, whereas the $\exists\forall$ prefix remains the
same concerning vertices. Secondly, by Lovasz and Vesztergombi [18] to a restricted class
of second order formulas, where existential quantifiers for 2-ary relationships are added
ahead of the above formula in Theorem 6.5 so that they can be included in $\varphi$, see Corollary
4.1 in [18]. Our framework allows for extending these results even further.

Corollary 6.6. Let $r_1, \dots, r_m, l, k \ge 1$ be arbitrary, and let $r = \max_i r_i$. Any r-graph property
that is expressible by the truth assignments of the second order formula

    \exists L_1, \dots, L_m \, \exists u_1, \dots, u_l \, \forall v_1, \dots, v_k \; \varphi(L_1, \dots, L_m, u_1, \dots, u_l, v_1, \dots, v_k),    (6.1)

where the $L_i$ are symmetric $r_i$-ary predicate symbols and $u_1, \dots, u_l, v_1, \dots, v_k$ are nodes, and $\varphi$ is
a quantifier-free first order expression containing adjacency, equality, and the symmetric $r_i$-ary
predicates $L_i$ for each $i \in [m]$, is testable.

Proof (Sketch). We first note that any collection of the relations $L_1, \dots, L_m$ can be encoded
into one edge-colored r-uniform hypergraph with at most $2^{2^r m}$ colors with an additional
compatibility requirement. An edge color for $e \in \binom{[n]}{r}$ consists of a $(2^r - 1)$-tuple corresponding
to the non-empty subsets of [r], where the entry corresponding to $S \subset [r]$ is determined
by the evaluation of $p_S(e)$ in the relations $L_i$ that have arity $|S|$. We can reconstruct the
predicates from a coloring whenever the colors of any pair of edges $e$ and $e'$ are such that
their entries corresponding to the power set of $e \cap e'$ coincide; for r = 2 this means some
combinations of colors (determined by a partition of the colors) for incident edges are
forbidden. This compatibility criterion for $2^{2^r m}$-colored r-graphs is known to be a testable
property; from here on this will be seen as a default condition.

For a fixed tuple $L_1, \dots, L_m$ of relations of arity at most r the property corresponding
to the first order expression $\forall v_1, \dots, v_k \, \varphi(L_1, \dots, L_m, v_1, \dots, v_k)$ is equivalent to the property
of $2^{2^r m}$-colored r-graphs that is defined by forbidding certain subgraphs of size at most k.
This is testable by the following theorem of Austin and Tao [4] that generalizes the result
of Rödl and Schacht [19].

Theorem 6.7. [4] For any $r, k \ge 1$, every hereditary property of k-colored r-graphs is testable.

We sketch now that the properties corresponding to the more general formula (6.1)
in the statement of the corollary are indistinguishable from the existence of a further
node-coloring on top of the edge-colored graphs such that no subgraph appears from a
certain set of forbidden subgraphs. We follow the argument of [1] (see also [13], and [18]).
Two properties are said to be indistinguishable in this sense whenever for every $\varepsilon > 0$
there exists an $n_0 = n_0(\varepsilon)$ such that any graph on $n \ge n_0$ vertices that has one property
can be modified by at most $\varepsilon n^r$ edge additions or removals to obtain a graph that has the
other property, and vice versa. The testability behavior of the two properties is identical.
Consider $L_1, \dots, L_m$ as fixed, then the property of $2^{2^r m}$-colored r-graphs corresponding to
$\exists u_1, \dots, u_l \, \forall v_1, \dots, v_k \, \varphi(L_1, \dots, L_m, u_1, \dots, u_l, v_1, \dots, v_k)$ is indistinguishable from the
existence of the following proper coloring. Every node gets either color (0,0) or (a,b), where
a represents a $2^{2^r m}$-colored r-graph on $l$ nodes, and b represents an $l$-tuple of $2^{2^r m}$-colored
edges. A coloring is proper if there are at most $l$ nodes colored by (0,0), and further for any
other color appearing the first component a is identical. Now a colored subgraph of size
k is forbidden if, considering the edge-colored graph on $V = \{v_1, \dots, v_k\}$ (without node
colors) supplemented by a graph on $\{u_1, \dots, u_l\}$ together with their connection to $V$ given
by the node colors on $V$, the evaluation of the formula $\varphi(L_1, \dots, L_m, u_1, \dots, u_l, v_1, \dots, v_k)$ is
false.

It is not hard to see that for this coloring property Theorem 6.7 applies since it is
hereditary, therefore it is testable. Now if we let $L_1, \dots, L_m$ be arbitrary and apply
Theorem 1.3, then we obtain the testability of the property given by (6.1) in the statement
of the corollary. □

6.3 Estimation of the distance to properties 

We can also express the property of being close to a given property in the nondeterministic
framework, and can show the testability here. This problem was introduced first for
graphs by Fischer and Newman [8]; in that paper the authors show the equivalence of
testability and estimability of the distance to a property, and in [18] one direction of this was
reproved for graphs. To our knowledge the generalization for r-graphs has not been
considered yet. Recall that $d_1$ is the edit distance.

Corollary 6.8. For any $r \ge 1$, testable r-graph property $\mathcal{P}$ and real $c > 0$ the property $d_1(\cdot, \mathcal{P}) \le c$
is testable.

Proof. The proof is identical to the one given in [18]: for any $r \ge 1$, testable r-graph property
$\mathcal{P}$ and real $c > 0$ we exhibit a testable property of 4-colored r-graphs that witnesses the property
$d_1(\cdot, \mathcal{P}) \le c$. Let $G$ be an arbitrary r-graph, then we consider the 2-colorings of $G$ where
(1,1) and (1,2) color the edges of $G$, and (2,1) and (2,2) the non-edges. The 4-colored
witness property $\mathcal{Q}$ is then that the edges with the colors (1,1) and (2,1) together form a
member of $\mathcal{P}$, and additionally there are at most $c n^r$ edges colored by (1,2) or (2,1). The
property $\mathcal{Q}$ is trivially testable, therefore Theorem 1.3 implies the statement. □
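
The witness coloring used in the proof can be illustrated for r = 2 with the following minimal sketch. Given a graph and a hypothetical target graph that lies in the property, it builds the 4-coloring of the vertex pairs and checks the two conditions of the witness property. The helper names `witness_coloring`, `is_witness`, and `triangle_free` are assumptions of the sketch, and triangle-freeness merely stands in for an arbitrary property.

```python
from itertools import combinations

def witness_coloring(n, edges, target_edges):
    """Color every vertex pair: first coordinate 1 for edges of G, 2 for
    non-edges; the pairs colored (1,1) or (2,1) form the target graph."""
    E = {frozenset(e) for e in edges}
    T = {frozenset(e) for e in target_edges}
    coloring = {}
    for pair in combinations(range(n), 2):
        p = frozenset(pair)
        if p in E:
            coloring[pair] = (1, 1) if p in T else (1, 2)
        else:
            coloring[pair] = (2, 1) if p in T else (2, 2)
    return coloring

def is_witness(n, coloring, in_property, c):
    """Check the two conditions of the witness property for r = 2."""
    kept = {pair for pair, col in coloring.items() if col in ((1, 1), (2, 1))}
    modified = sum(col in ((1, 2), (2, 1)) for col in coloring.values())
    return in_property(n, kept) and modified <= c * n**2

def triangle_free(n, pairs):
    """Example property; pairs is a set of sorted vertex pairs."""
    return not any({(a, b), (a, c), (b, c)} <= pairs
                   for a, b, c in combinations(range(n), 3))

triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
col = witness_coloring(3, triangle, path)
print(is_witness(3, col, triangle_free, 0.2))  # True
print(is_witness(3, col, triangle_free, 0.1))  # False
```

In the example one pair is modified, so the coloring is a witness exactly when $c n^2 \ge 1$, that is, for $c \ge 1/9$.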

7 Further research


The general upper bound given in Theorem 1.3 is dependent on the order r; it would
be interesting to see if it is possible to remove this dependence in a similar way as it
was shown in the special case of linearly ND-testable parameters. A more ambitious
goal would be to transform effective bounds into efficient ones, if possible. We mean by this
the verification that the sample complexity of the original parameter or property is of
the same magnitude (up to polynomial dependence) as the sample complexity of the
witness parameter. Currently no non-trivial lower bound on the sample complexity in
our framework is known; in the original dense property testing setting there are some
properties that admit no tester that only makes a polynomial number of queries, such
as triangle-freeness and other properties defined by forbidden families of subgraphs or
induced subgraphs.

The partition problems described in Section 6 have led to further applications in the
graph case; this development was presented in [9]. As mentioned, the framework of
[9] also dealt with tuples of hypergraphs extending the result of [11]; this enabled the
analysis of the number of 4-cycles appearing in the bipartite graphs induced by the pairs
of the partition classes instead of only observing the edge density by means of adding an
auxiliary 4-graph to the simple graph. An alternative characterization of the notion of a
regular bipartite graph says that a pair of classes is regular if and only if the number of
4-cycles spanned by them is minimal, in other words their density is approximately
the fourth power of the edge density. Using this together with the result regarding the
testability of partition problems of [9], the authors there were able to show that satisfying a
certain regularity instance is also testable. This achievement in turn implies an algorithmic
version of the Regularity Lemma. In this manner, Corollary 6.4 might be of further
use for testing regular partitions of r-uniform hypergraphs by utilizing concepts that
emerged during the course of research towards an algorithmic version of the Hypergraph
Regularity Lemma (see for example Haxell et al. [12]) in a similar way to the approach in
[9].
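
The 4-cycle characterization of regular pairs mentioned above can be checked numerically on small examples. The following minimal sketch computes, for a bipartite graph given by its 0/1 biadjacency matrix, the edge density $p$ and the homomorphism density of the 4-cycle; by the Cauchy-Schwarz inequality the latter is at least $p^4$, and a pair is close to regular when the two are close. The function name `edge_and_c4_density` and the use of homomorphism counts (which include degenerate 4-cycles) are assumptions of the sketch.

```python
def edge_and_c4_density(B):
    """B: 0/1 biadjacency matrix, rows for one class and columns for the other.

    Returns (p, t) where p is the edge density and t is the homomorphism
    density of the 4-cycle between the two classes.
    """
    a, b = len(B), len(B[0])
    p = sum(map(sum, B)) / (a * b)
    c4 = 0
    # A homomorphic 4-cycle is a pair of rows together with a pair of their
    # common neighbors, so it is counted by squared codegrees.
    for u in range(a):
        for v in range(a):
            codeg = sum(B[u][j] * B[v][j] for j in range(b))
            c4 += codeg ** 2
    return p, c4 / (a**2 * b**2)

print(edge_and_c4_density([[1, 1], [1, 1]]))  # (1.0, 1.0)
print(edge_and_c4_density([[1, 0], [0, 1]]))  # (0.5, 0.125)
```

For the complete pair the 4-cycle density equals $p^4$, while for the perfect matching it is $0.125 > 0.5^4 = 0.0625$, reflecting that the matching is far from regular.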

On a further thought, one may depart from the setting of dense r-graphs in favor of
other classes of combinatorial objects in order to define and study their ND-testability.
Such are for example semi-algebraic hypergraphs that admit a regularity lemma that
produces a polynomial number of classes as a function of the multiplicative inverse of the
proximity parameter, thus they are good candidates for an improvement on the sample
complexity.

Finally we mention a possible direction for further study towards the characterization
of locally repairable properties, see [4], that appears to be promising. This characteristic
is stronger than testability in the respect that in this setup there should exist a local
edge modifying algorithm applied to graphs that are close to the property that observes
only some piece of bounded size of the graph and its connection to single vertex pairs
and decides upon adjacency depending only on this information. The output of this
algorithm should be a graph that is close to the input and actually satisfies the property.
We may define nondeterministically locally repairable properties in a straightforward
way analogous to ND-testing by requiring a certain locally repairable property of
edge-colored graphs that reduces to the original property after the discoloring procedure. It has
been established in [4] that hereditary graph properties are locally repairable, but there
are examples of hereditary properties of directed graphs and 3-graphs that are testable,
but not locally repairable. It would be compelling to investigate analogous problems
concerning nondeterministically locally repairable properties.